## Gradient Descent for a Neural Network with a Sigmoid Hidden Layer
Here's a simplified neural network architecture:

1. Input layer: One neuron with input $x$.
2. Hidden layer: One neuron with weight $w$ and bias $b$, applying the sigmoid activation function.
3. Output layer: One neuron with weight $v$ and bias $c$, applying the linear (identity) activation function.

The network's output $y$ can be expressed as:

$$y = v \cdot \text{sigmoid}(w \cdot x + b) + c$$
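The formula above can be sketched as a small forward-pass function. This is a minimal illustration in plain Python; the function names `sigmoid` and `forward` are chosen here for the example and do not come from any particular library.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x: float, w: float, b: float, v: float, c: float) -> float:
    """Output of the network: y = v * sigmoid(w * x + b) + c."""
    hidden = sigmoid(w * x + b)  # hidden-layer activation
    return v * hidden + c        # linear (identity) output layer

# With w = 0 and b = 0 the hidden activation is sigmoid(0) = 0.5,
# so the output is v * 0.5 + c regardless of x.
print(forward(x=3.0, w=0.0, b=0.0, v=2.0, c=1.0))  # 2.0
```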

Your goal is to use gradient descent to train the network by updating the weights and biases to minimize a cost function $J(\theta)$, where $\theta$ represents all the network parameters ($w$, $b$, $v$, and $c$).

**Initialization**:
- Initialize the parameters randomly or with predetermined values, e.g., $w = 0.5$, $b = 0.2$, $v = 0.3$, $c = 0.1$.
- Set the learning rate $\alpha$ (e.g., $\alpha = 0.1$).
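Both initialization options might look like the sketch below. The dictionary layout, the helper name `random_params`, and the sampling range `[-0.5, 0.5]` are assumptions made for illustration; the text does not prescribe a particular random distribution.

```python
import random

# Predetermined values taken from the example in the text.
params = {"w": 0.5, "b": 0.2, "v": 0.3, "c": 0.1}
alpha = 0.1  # learning rate

def random_params(seed: int, scale: float = 0.5) -> dict:
    """Draw each parameter uniformly from [-scale, scale] (assumed range)."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return {name: rng.uniform(-scale, scale) for name in ("w", "b", "v", "c")}

print(params)
print(random_params(seed=0))
```

Seeding the generator keeps experiments repeatable, which helps when comparing learning rates.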

**Training**:
1. For each training example, compute the predicted output $y$ using the current network parameters:

   $$y = v \cdot \text{sigmoid}(w \cdot x + b) + c$$

2. Compute the cost $J(\theta)$ for that example, e.g., half the squared error between the predicted output $y$ and the actual target output $y_{\text{target}}$ (averaging this quantity over all examples gives half the mean squared error):

   $$J(\theta) = \frac{1}{2}(y - y_{\text{target}})^2$$

3. Compute the gradients of the cost function with respect to the parameters $\theta$ using backpropagation. For example:

   - $\frac{\partial J}{\partial v} = (y - y_{\text{target}}) \cdot \text{sigmoid}(w \cdot x + b)$
   - $\frac{\partial J}{\partial c} = y - y_{\text{target}}$
   - $\frac{\partial J}{\partial w} = \frac{\partial J}{\partial y} \cdot \frac{\partial y}{\partial \text{sigmoid}} \cdot \frac{\partial \text{sigmoid}}{\partial(w \cdot x + b)} \cdot \frac{\partial(w \cdot x + b)}{\partial w} = (y - y_{\text{target}}) \cdot v \cdot s(1 - s) \cdot x$
   - $\frac{\partial J}{\partial b} = \frac{\partial J}{\partial y} \cdot \frac{\partial y}{\partial \text{sigmoid}} \cdot \frac{\partial \text{sigmoid}}{\partial(w \cdot x + b)} \cdot \frac{\partial(w \cdot x + b)}{\partial b} = (y - y_{\text{target}}) \cdot v \cdot s(1 - s)$

   Here $s = \text{sigmoid}(w \cdot x + b)$, and $s(1 - s)$ is the derivative of the sigmoid with respect to its input.

4. Update the parameters using gradient descent:

   - $v = v - \alpha \cdot \frac{\partial J}{\partial v}$
   - $c = c - \alpha \cdot \frac{\partial J}{\partial c}$
   - $w = w - \alpha \cdot \frac{\partial J}{\partial w}$
   - $b = b - \alpha \cdot \frac{\partial J}{\partial b}$
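Steps 1 to 4 for a single training example can be sketched as follows. The chain-rule products are expanded using the sigmoid derivative $s(1 - s)$ with $s = \text{sigmoid}(w \cdot x + b)$, and the gradient for $v$ carries the hidden activation $s$ as a factor because $y$ depends on $v$ only through $v \cdot s$. The function names and the training example `(x, y_target) = (1.0, 1.0)` are hypothetical choices for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(x, y_target, w, b, v, c):
    """J = 0.5 * (y - y_target)^2 for one example."""
    y = v * sigmoid(w * x + b) + c
    return 0.5 * (y - y_target) ** 2

def gradients(x, y_target, w, b, v, c):
    """Return (dJ/dw, dJ/db, dJ/dv, dJ/dc) via the chain rule."""
    s = sigmoid(w * x + b)   # hidden activation
    y = v * s + c            # network output
    err = y - y_target       # dJ/dy
    ds_dz = s * (1.0 - s)    # derivative of sigmoid w.r.t. its input
    dw = err * v * ds_dz * x
    db = err * v * ds_dz
    dv = err * s
    dc = err
    return dw, db, dv, dc

def step(x, y_target, w, b, v, c, alpha=0.1):
    """One gradient descent update; all gradients use the old parameters."""
    dw, db, dv, dc = gradients(x, y_target, w, b, v, c)
    return w - alpha * dw, b - alpha * db, v - alpha * dv, c - alpha * dc

# Hypothetical training example; parameter values are those from the text.
x, y_target = 1.0, 1.0
params = (0.5, 0.2, 0.3, 0.1)  # w, b, v, c
before = cost(x, y_target, *params)
params = step(x, y_target, *params, alpha=0.1)
after = cost(x, y_target, *params)
print(before, after)  # the cost should be lower after one small step
```

A quick way to gain confidence in hand-derived gradients is to compare them with central finite differences of `cost`, perturbing one parameter at a time.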

Repeat steps 1-4 over the training examples for multiple passes until the cost stops decreasing appreciably. Because the cost is not convex in the parameters, the point where gradient descent settles may be a local rather than a global minimum.
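Putting the pieces together, the repeated procedure might look like the loop below. This is a sketch under stated assumptions: the synthetic dataset (targets generated from a sigmoid curve with made-up coefficients), the fixed epoch count used in place of a convergence test, and the function name `train` are all illustrative choices, not part of the original description.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, w=0.5, b=0.2, v=0.3, c=0.1, alpha=0.1, epochs=200):
    """Per-example gradient descent; returns final parameters and cost history."""
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, y_target in data:
            # Step 1: forward pass
            s = sigmoid(w * x + b)
            y = v * s + c
            # Step 2: cost for this example (measured before the update)
            err = y - y_target
            total += 0.5 * err ** 2
            # Step 3: gradients, all computed from the current parameters
            ds_dz = s * (1.0 - s)
            dw = err * v * ds_dz * x
            db = err * v * ds_dz
            dv = err * s
            dc = err
            # Step 4: parameter updates
            w -= alpha * dw
            b -= alpha * db
            v -= alpha * dv
            c -= alpha * dc
        history.append(total / len(data))  # mean cost over this pass
    return (w, b, v, c), history

# Synthetic data (assumed for illustration): x from -3 to 3 in steps of 0.5.
data = [(i / 2.0, 2.0 * sigmoid(1.5 * (i / 2.0) - 0.5) + 1.0) for i in range(-6, 7)]
final_params, history = train(data)
print(history[0], history[-1])  # mean cost in the first and last pass
```

In practice the fixed `epochs` count would often be replaced by a stopping rule, such as halting when the mean cost changes by less than a small tolerance between passes.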
